DOCUMENT RESUME

ED 395 995    TM 025 194

AUTHOR       Jiang, Hai; And Others
TITLE        An Estimation Procedure for the Structural Parameters
             of the Unified Cognitive/IRT Model.
PUB DATE     8 Mar 96
NOTE         23p.; Paper presented at the Annual Meeting of the
             National Council on Measurement in Education (New
             York, NY, April 9-11, 1996).
PUB TYPE     Reports - Evaluative/Feasibility (142) --
             Speeches/Conference Papers (150)
EDRS PRICE   MF01/PC01 Plus Postage.
DESCRIPTORS  *Cognitive Processes; *Estimation (Mathematics);
             *Item Response Theory; *Maximum Likelihood
             Statistics; Psychometrics; *Test Construction
IDENTIFIERS  *EM Algorithm



ABSTRACT

L. V. DiBello, W. F. Stout, and L. A. Roussos (1993)
have developed a new item response model, the Unified Model, which
brings together the discrete, deterministic aspects of cognition
favored by cognitive scientists and the continuous, stochastic
aspects of test response behavior that underlie item response theory
(IRT). The Unified Model blends psychometric and cognitive science
viewpoints and promises to allow the practitioner to recover
cognitive information from simple, well-designed tests. This paper
proposes an estimation procedure for the structural model parameters
of the Unified Model that uses the marginal maximum likelihood
estimation approach of Bock and Aitkin (1981) and the EM algorithm of
A. P. Dempster, N. M. Laird, and D. B. Rubin (1977). In the
maximization (M) step of the EM algorithm, because of the
difficulties in computing the second derivative (Hessian) matrix and
the possibility of multiple local maxima, an alternative
maximization procedure is proposed. This procedure, called Evolution
Programming (Z. Michalewicz, 1994), has good properties in finding a
global extremum. A simulation study is then given to show the
effectiveness of the estimation procedure. (Contains 2 figures, 7
tables, and 16 references.) (Author/SLD)









Reproductions supplied by EDRS are the best that can be made 
from the original document. 










An Estimation Procedure for the Structural 
Parameters of the Unified Cognitive/IRT Model 






Hai Jiang 

University of Illinois at Urbana-Champaign 

Louis DiBello 

Law School Admission Council 

William Stout 

University of Illinois at Urbana-Champaign 



March 5, 1996 






Paper presented at the 1996 NCME annual meeting, New York, NY, April 1996.







Abstract 



DiBello, Stout, and Roussos (1993) have developed a new item response model, called the
Unified Model, which brings together the discrete, deterministic aspects of cognition favored by
cognitive scientists and the continuous, stochastic aspects of test response behavior that underlie
item response theory. The Unified Model blends psychometric and cognitive science viewpoints
and promises to allow the practitioner to recover cognitive information from simple, well-designed
tests.

In this paper, we propose an estimation procedure for the structural model parameters of
the Unified Model that uses the marginal maximum likelihood estimation approach of Bock and
Aitkin (1981) and utilizes the EM algorithm of Dempster, Laird, and Rubin (1977). In the
maximization (M) step of the EM algorithm, because of the difficulties in computing the second
derivative (Hessian) matrix and the possibility of multiple local maxima, we propose using an
alternative maximization procedure, called Evolution Programming (Michalewicz, 1994), which
has good properties in finding a global extremum. A simulation study is then given to show the
effectiveness of our estimation procedure.

Key words: Unified Model, cognitive, item response theory, marginal maximum likelihood
estimation, EM algorithm, Evolution Programming.



1 Introduction 



DiBello, Stout, and Roussos (1993) have proposed a new psychometric approach to cognitive
diagnostic assessment. They develop a new item response model, called the Unified Model, which
brings together the discrete, deterministic aspects of cognition favored by cognitive scientists and 
the continuous, stochastic aspects of test response behavior that underlie item response theory. 
The Unified Model blends psychometric and cognitive science viewpoints and promises to allow 
the practitioner to recover cognitive information from simple, well-designed tests. 

The ultimate goal of developing the Unified Model is to enable practitioners to cognitively
classify the test takers and estimate their cognitive abilities, thereby extracting useful information
about the test takers' underlying cognitive processes on the test and their cognitive strengths
and weaknesses. To achieve this goal, the model parameters must be estimated before
examinees can be classified and their abilities estimated. Unfortunately, due to its
structural complexity, until now there has been no estimation package available for the Unified
Model. There are difficulties associated with the estimation problem, foremost among them
the identifiability and estimability issues involving the model parameters. In this
paper, we first give a brief overview of the Unified Model from the cognitive diagnosis viewpoint.
After the overview, we discuss briefly the relationship between the deterministically predicted
ideal response patterns and the attribute states, as well as the identifiability and estimability
issues involving the item parameters and the latent ability distribution parameters.

The third section of this paper concerns the estimation of the structural parameters of the
Unified Model. In this section we propose an estimation procedure using the marginal maximum
likelihood estimation approach of Bock and Aitkin (1981). Since we cannot maximize the
marginal likelihood directly, we utilize the EM algorithm of Dempster, Laird, and Rubin (1977).
In the maximization (M) step of the EM algorithm, because of the difficulties in computing the
second derivative (Hessian) matrix and the possibility of multiple local maxima, we propose using
Evolution Programming (Michalewicz, 1994), which has good properties in finding a global
extremum.

In the fourth section of this paper, a simulation study is presented showing the effectiveness of
our proposed estimation procedure. We end the paper with a summary section.






2 The Unified Cognitive/IRT Model 



2.1 Review of the Unified Model 

The traditional role of tests in Education and Psychology is to rank the examinees and/or
judge their proficiencies within a broad area of knowledge. For example, the GRE tests examinee
proficiencies on verbal, quantitative, and analytical reasoning skills. In cognitive diagnosis, when
giving a test, the test developers and administrators are interested not only in judging examinee
proficiencies in a specific area of knowledge, but also in getting information on examinees'
underlying cognitive processes used on the test. This happens in the usual classroom setting:
when giving a test, a teacher not only wants to know which grade Johnny gets, but perhaps
more importantly, she wants to know whether Johnny has really mastered the Algebraic Rules
of Exponents. In other words, she wants to assess the examinee's mastery of a variety of cognitive
attributes. An attribute represents a cognitive quality required for solution of a test item: it
can be anything based on the procedures, skills, processes, strategies, or the knowledge that an
examinee needs to possess to solve the item.

There are two distinct approaches in cognitive diagnosis: the continuous multidimensional
latent trait approach favored by psychometricians and the discrete approach favored by cognitive
scientists. In the usual latent trait approach, a few broadly described continuous latent traits
are postulated to account for systematic examinee response behavior on a test. As Snow and
Lohman (1989) noted, this approach has a weak cognitive foundation. Although it sometimes
sounds like the multidimensional underlying latent traits are cognitive in nature, it is generally
agreed that this approach has only been successful with broad, composite abilities. In particular,
it is of little help in trying to determine specific cognitive characteristics of examinees
for the purpose of instruction.

For the discrete approach, an example is latent class analysis (see, for example, Lazarsfeld and
Henry, 1968). A latent class analysis involves the postulation of a number of latent classes. In a
latent class analysis, examinee ability is not represented as a continuous variable on dimensions
defined by the cognitive components. Instead, it is modeled by a vector of 1s and 0s indicating
for each cognitive component whether an examinee does or does not possess the skills needed for
successful performance on the component. Latent class analyses either involve so large a number
of classes that estimation is infeasible, or so few latent classes that the
results are similar to multidimensional latent trait analyses in the coarseness of the latent structure
assumed (Bock and Aitkin, 1981; Bartholomew, 1987; Takane and de Leeuw, 1987; Haertel,
1990).

The Unified Model approach blends psychometric and cognitive science viewpoints. It is
based upon a new item response model, called the Unified Model. Below we will give a brief
review of the Unified Model.



Following Tatsuoka (Tatsuoka, K.K., 1984, 1985, 1990; Tatsuoka, K.K. and Tatsuoka, M.M.,
1987), we consider a test of length $I$ with $K$ postulated cognitive attributes, and a matrix
$Q = (q_{ki})_{K \times I}$, where

$$
q_{ki} =
\begin{cases}
1 & \text{if item } i \text{ requires attribute } k \\
0 & \text{if not.}
\end{cases}
$$

The K attributes include those of interest for cognitive diagnosis, as well as others inadvertently 
present in the test. The Q matrix specifies which attributes must be mastered in order to 
correctly answer each item. 

The Q matrix represents a presumed choice of strategy for each item. By strategy we mean 
the steps that are used in answering the item. 

Let $\alpha = (\alpha_1, \ldots, \alpha_K)^T$ be a vector denoting an examinee's attribute state,
where $x^T$ denotes the transpose of vector $x$, and

$$
\alpha_k =
\begin{cases}
1 & \text{if examinee has mastered attribute } k \\
0 & \text{if not.}
\end{cases}
$$

A given examinee attribute state $\alpha = (\alpha_1, \ldots, \alpha_K)^T$, along with the Q matrix,
produces the ideal response pattern associated with $\alpha$ and determined by $Q$:

$$
x_Q(\alpha) = (x_1, \ldots, x_I)^T. \tag{1}
$$

It is defined as follows: for the ideal response, item $i$ is answered correctly if the examinee
possesses all the attributes required by $Q$ for this item; otherwise, item $i$ is answered incorrectly.
In mathematical terms,

$$
x_i =
\begin{cases}
0 & \text{if there is an attribute } k \text{ for which } q_{ki} = 1 \text{ but } \alpha_k = 0 \\
1 & \text{if not.}
\end{cases} \tag{2}
$$
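The deterministic rule for the ideal response pattern can be sketched in a few lines of Python. This is only an illustration: the function name `ideal_response` and the small matrix `Q_demo` are hypothetical and do not come from the paper.

```python
import numpy as np

def ideal_response(Q, alpha):
    """Ideal response pattern for a K x I matrix Q and a 0/1 attribute state alpha."""
    Q = np.asarray(Q)
    alpha = np.asarray(alpha)
    # Item i is missed exactly when some required attribute (q_ki = 1)
    # has not been mastered (alpha_k = 0).
    missing = (Q == 1) & (alpha[:, None] == 0)
    return (~missing.any(axis=0)).astype(int)

# Hypothetical 3-attribute, 4-item Q matrix (rows = attributes, columns = items).
Q_demo = np.array([[1, 0, 1, 0],
                   [0, 1, 1, 0],
                   [0, 0, 0, 1]])
x_demo = ideal_response(Q_demo, [1, 1, 0])
```

Here an examinee who has mastered the first two attributes answers the first three items correctly in the ideal pattern and misses the item that requires the third attribute.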










In reality, examinee responses are seldom consistent with such a simple deterministic model. 
We expect examinee responses to differ from the ideal response patterns. The Unified Model 
approach models the probabilistic variation in examinee responses by incorporating the following 
four major sources of response variation. 



Strategy:      The examinee may use a different strategy from that presumed by the
               Q matrix.

Completeness:  An item may require attributes that are not listed in the Q matrix. If
               so, we will say the Q matrix is incomplete for the item.

Positivity:    In some cases, an examinee who possesses an attribute will fail to
               apply it correctly to an item, and another examinee who lacks the
               attribute will apply it correctly to the item. If such response
               behavior is prevalent among the examinees, we will say the attribute
               is low positive for the item.

Slips:         The examinee may commit a random error.



To allow for cases in which multiple strategies are used by examinees and cases in which the Q
matrix is incomplete, the notion of a latent residual ability $\eta$ is introduced in the Unified Model.
Hence under the Unified Model, the complete latent ability for an examinee is
$\theta = (\eta, \alpha_1, \ldots, \alpha_K)^T$, where $\alpha$ is the examinee's attribute state and $\eta$ is his residual ability.

Under the Unified Model, the probability of an examinee answering item $i$ correctly (denoted
$Y_i = 1$) given that he has ability $\theta$ is

$$
P_i(\eta, \alpha; \beta_i) = P(Y_i = 1 \mid \eta, \alpha; \beta_i)
= (1 - p)\left[ d_i S_{i\alpha} H_{b_i}(\eta + 2c_i) + (1 - d_i) H_{b_i}(\eta) \right], \tag{3}
$$

where

$p$ = probability of a random slip

$d_i$ = probability of selecting Q strategy for item $i$

$c_i$ = completeness index of Q matrix for item $i$

$S_{i\alpha} = \prod_{k=1}^{K} \pi_{ki}^{\alpha_k q_{ki}} r_{ki}^{(1 - \alpha_k) q_{ki}}$ = $P$(applying required attributes correctly to $i \mid \alpha$)

$\pi_{ki}$ = $P$(applying attribute $k$ correctly to $i \mid \alpha_k = 1$)

$r_{ki}$ = $P$(applying attribute $k$ correctly to $i \mid \alpha_k = 0$)

$H_{b_i}(\eta) = \dfrac{1}{1 + \exp\{-1.7(\eta - b_i)\}}$ = one-parameter logistic (1PL) with difficulty $b_i$

$\beta_i = (b_i, c_i, d_i, \pi_{1i}, \ldots, \pi_{Ki}, r_{1i}, \ldots, r_{Ki})$

The four sources of response variation are incorporated in the Unified Model through the
parameters $p$, the $c_i$'s, the $d_i$'s, the $\pi$'s and the $r$'s; for example, the $\pi$'s and $r$'s are used to model the
positivity of the attributes. For a derivation of the model, the reader is directed to DiBello,
Stout, and Roussos (1993).

From now on, we assume $p = 0$ (no slip) for all items. Thus the probability of answering item
$i$ correctly given examinee ability $\theta$ becomes

$$
P_i(\eta, \alpha; \beta_i) = d_i S_{i\alpha} H_{b_i}(\eta + 2c_i) + (1 - d_i) H_{b_i}(\eta). \tag{4}
$$
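As a rough numerical sketch of the no-slip item response function, the Python below evaluates the mixture of the Q-strategy term and the alternative-strategy term for one item. The function names are hypothetical, and the logistic scaling constant 1.7 and the exact algebraic form are assumptions based on this paper's description of the model; they should be checked against DiBello, Stout, and Roussos (1993).

```python
import numpy as np

def logistic_1pl(eta, b, scale=1.7):
    """One-parameter logistic with difficulty b (the scale 1.7 is an assumption)."""
    return 1.0 / (1.0 + np.exp(-scale * (eta - b)))

def s_positivity(alpha, q, pi, r):
    """Product over required attributes of pi (if mastered) or r (if not mastered)."""
    alpha, q, pi, r = map(np.asarray, (alpha, q, pi, r))
    per_attr = np.where(alpha == 1, pi, r)
    return float(np.prod(np.where(q == 1, per_attr, 1.0)))

def p_correct(eta, alpha, q, b, c, d, pi, r):
    """No-slip probability of a correct response to one item."""
    s = s_positivity(alpha, q, pi, r)
    return d * s * logistic_1pl(eta + 2.0 * c, b) + (1.0 - d) * logistic_1pl(eta, b)

# Illustrative values for an item requiring attributes 1 and 2 out of 5;
# entries of pi and r for attributes the item does not require are ignored.
q = [1, 1, 0, 0, 0]
pi = [0.9, 0.8, 1.0, 1.0, 1.0]
r = [0.1, 0.4, 1.0, 1.0, 1.0]
p_master = p_correct(0.0, [1, 1, 0, 0, 0], q, b=-0.3497, c=0.90, d=0.90, pi=pi, r=r)
p_nonmaster = p_correct(0.0, [0, 0, 0, 0, 0], q, b=-0.3497, c=0.90, d=0.90, pi=pi, r=r)
```

With these values a master of both required attributes has a clearly higher success probability than a non-master at the same residual ability, which is the behavior the positivity parameters are meant to capture.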

2.2 Ideal response patterns 

A given examinee attribute state $\alpha$, along with the Q matrix, produces the ideal response
pattern associated with $\alpha$ as defined by (1) and (2). Since we postulate $K$ attributes, there are
$2^K$ different attribute states. The number of different ideal response patterns, however, is usually
smaller than $2^K$ because different attribute states may produce exactly the same
ideal response pattern.

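Example 1: For the Q matrix given below,

$$
Q = \begin{pmatrix}
1 & 0 & 0 & 1 & 0 & 0 & 1 & 1 & 0 & 0 \\
1 & 0 & 1 & 0 & 0 & 1 & 0 & 0 & 0 & 1 \\
0 & 1 & 1 & 0 & 0 & 1 & 0 & 0 & 1 & 0 \\
0 & 1 & 1 & 0 & 1 & 0 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 1 & 1 & 0 & 0 & 0 & 1 & 0
\end{pmatrix},
$$

both attribute states $(1,0,1,1,0)^T$ and $(1,0,0,1,0)^T$ produce the same ideal response pattern
$(0,0,0,0,0,0,1,1,0,0)^T$.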
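The claim in Example 1 can be checked by brute force. The sketch below assumes the Q matrix is read as a 5 x 10 array (attributes by items); it enumerates all 32 attribute states and groups them by the ideal response pattern they produce. The helper name `ideal_response` is hypothetical.

```python
import itertools
import numpy as np

# Q matrix of Example 1, assumed to be 5 attributes (rows) by 10 items (columns).
Q = np.array([
    [1, 0, 0, 1, 0, 0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0, 1, 0, 0, 0, 1],
    [0, 1, 1, 0, 0, 1, 0, 0, 1, 0],
    [0, 1, 1, 0, 1, 0, 0, 1, 0, 0],
    [0, 1, 0, 1, 1, 0, 0, 0, 1, 0],
])

def ideal_response(Q, alpha):
    """Tuple of 0/1 ideal responses: 1 when every attribute required by the item is mastered."""
    alpha = np.asarray(alpha)
    return tuple(int(np.all(alpha[Q[:, i] == 1] == 1)) for i in range(Q.shape[1]))

# Group the 2^K attribute states by the ideal response pattern they produce.
classes = {}
for alpha in itertools.product([0, 1], repeat=Q.shape[0]):
    classes.setdefault(ideal_response(Q, alpha), []).append(alpha)

n_patterns = len(classes)
```

Each key of `classes` is one ideal response pattern and each value lists the attribute states that share it, so `n_patterns` counts the distinct patterns that this Q matrix can distinguish.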






Definition 1: For two attribute states $\alpha_1 = (\alpha_{11}, \ldots, \alpha_{1K})^T$ and
$\alpha_2 = (\alpha_{21}, \ldots, \alpha_{2K})^T$, $\alpha_1$ is a substate of $\alpha_2$, denoted by
$\alpha_1 \le \alpha_2$, if $\alpha_{1k} \le \alpha_{2k}$ for $k = 1, \ldots, K$.

In Example 1, attribute state $(1,0,0,1,0)^T$ is a substate of attribute state $(1,0,1,1,0)^T$.

Definition 2: Among all the attribute states that produce the same ideal response pattern,
the canonical state is the one that has the smallest number of 1's.

In Example 1, $(1,0,0,1,0)^T$ is the canonical state that produces the ideal response pattern
$(0,0,0,0,0,0,1,1,0,0)^T$.

It can be shown that the canonical state as given by the above definition is unique.

Definition 3: An attribute state $\alpha = (\alpha_1, \ldots, \alpha_K)^T$ is a direct sum of two attribute states
$\alpha_1 = (\alpha_{11}, \ldots, \alpha_{1K})^T$ and $\alpha_2 = (\alpha_{21}, \ldots, \alpha_{2K})^T$, denoted by
$\alpha = \alpha_1 \vee \alpha_2$, if
$\alpha_k = \alpha_{1k} \vee \alpha_{2k} = \max(\alpha_{1k}, \alpha_{2k})$ for $k = 1, \ldots, K$.



Whether an attribute state is a canonical state can be determined by the following proposition.

Proposition 1: An attribute state is a canonical state if

• it is the attribute state of all 1's or the attribute state of all 0's,

• it is a column of the Q matrix, or

• it has substates that are columns of the Q matrix and it is the direct sum of these substates.



Example 2: For the Q matrix given below,

$$
Q = \begin{pmatrix}
1 & 1 & 1 & 0 & 0 \\
0 & 0 & 1 & 1 & 0 \\
0 & 0 & 0 & 1 & 1 \\
0 & 0 & 0 & 0 & 1
\end{pmatrix},
$$

attribute state $(0,1,1,1)^T$ is a canonical state, because it has substates $(0,1,1,0)^T$ and $(0,0,1,1)^T$
that are columns of $Q$, and the direct sum of these two substates is $(0,1,1,1)^T$.
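One way to apply Proposition 1 mechanically is to take the direct sum of every column of Q that is a substate of a given attribute state and compare the result with the state itself. The Python sketch below does this for Example 2. The function names are hypothetical, and the check assumes that every attribute is required by at least one item.

```python
import numpy as np

def canonical_state(Q, alpha):
    """Direct sum (elementwise max) of the columns of Q that are substates of alpha."""
    Q = np.asarray(Q)
    alpha = np.asarray(alpha)
    is_substate = np.all(Q <= alpha[:, None], axis=0)  # one flag per column of Q
    if not is_substate.any():
        return np.zeros_like(alpha)                    # only the all-0 state remains
    return Q[:, is_substate].max(axis=1)

def is_canonical(Q, alpha):
    """True when alpha equals the direct sum of its Q-column substates."""
    return bool(np.array_equal(canonical_state(Q, alpha), np.asarray(alpha)))

# Q matrix of Example 2 (4 attributes by 5 items).
Q2 = np.array([[1, 1, 1, 0, 0],
               [0, 0, 1, 1, 0],
               [0, 0, 0, 1, 1],
               [0, 0, 0, 0, 1]])
example_is_canonical = is_canonical(Q2, [0, 1, 1, 1])
```

For a non-canonical state, `canonical_state` returns the representative with the fewest 1's that yields the same ideal response pattern.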



As results of the above proposition, we have the following corollaries concerning the number
of canonical states.

Corollary 1: If among the $K$ attributes postulated, only $K'$ are required by all the items,
there will be at most $2^{K'}$ canonical states.

In this case, there are $K - K'$ attributes not required by any item; in other words, they are
redundant. Below we assume this situation never happens.

Corollary 2: Consider all the items each requiring a single attribute; if the number of different
attributes required by these items is $K'$, there will be at least $2^{K'}$ canonical states.

The canonical states can now be used as representatives of attribute classes, so we can index
the set of all ideal response patterns, or the set of attribute classes, by $l = 1, \ldots, L$, and replace
attribute state $\alpha$ by index $l$ in our notation from here on. The latent space can now be thought of
as $\{(\eta, l): \eta \in \mathbb{R},\ l = 1, \ldots, L\}$.

For the distribution of latent ability $(\eta, l)^T$, we assume in our model a finite mixture of
normals with the mixing probability $p_l$ and $\eta \sim N(\mu_l, \sigma^2)$ for given $l$, i.e., the density of the latent ability
distribution is

$$
\pi(\theta; \phi) = \pi(\eta, l; \phi) = p_l \, \frac{1}{\sqrt{2\pi}\,\sigma}
\exp\left\{ -\frac{(\eta - \mu_l)^2}{2\sigma^2} \right\}, \tag{5}
$$

where the latent ability distribution parameters are

$$
\phi = (p_1, \ldots, p_L, \mu_1, \ldots, \mu_L, \sigma)^T.
$$
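A small Python sketch of the finite normal mixture may help fix ideas: it evaluates the joint density of residual ability and class index, and draws examinees by first sampling a class and then a residual ability. The function names, the 0-based class index, and the use of NumPy's `Generator` are choices made for illustration only.

```python
import numpy as np

def latent_density(eta, l, p, mu, sigma):
    """Joint density p_l * Normal(eta; mu_l, sigma^2), with l a 0-based class index."""
    z = (eta - mu[l]) / sigma
    return p[l] * np.exp(-0.5 * z**2) / (np.sqrt(2.0 * np.pi) * sigma)

def sample_latent(n, p, mu, sigma, rng):
    """Draw n pairs (eta, l): class l with probability p_l, then eta from Normal(mu_l, sigma^2)."""
    l = rng.choice(len(p), size=n, p=p)
    eta = rng.normal(loc=np.asarray(mu)[l], scale=sigma)
    return eta, l

# Hypothetical two-class example.
p_demo, mu_demo, sigma_demo = [0.3, 0.7], [-1.0, 1.0], 1.0
eta_draws, class_draws = sample_latent(1000, p_demo, mu_demo, sigma_demo,
                                       np.random.default_rng(0))
```

Summing the density over the classes and integrating over the residual ability gives 1, as required of a mixture with mixing probabilities that sum to 1.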

2.3 Log likelihood function 

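Suppose there are a total of $N$ examinees. Let $B = (\beta_1, \ldots, \beta_I)$ be the totality of item
parameters, and assume the latent space is complete with respect to the latent ability vector $\theta$,
so that local independence given $\theta$ holds:

$$
P(Y_n \mid \theta; B) = \prod_{i=1}^{I} P(Y_{ni} \mid \theta; \beta_i)
= \prod_{i=1}^{I} P_i(\theta; \beta_i)^{Y_{ni}} \left[ 1 - P_i(\theta; \beta_i) \right]^{1 - Y_{ni}},
$$

where $Y_n$ is the response vector for examinee $n$.

The marginal likelihood function, which is the likelihood function given the response matrix
$Y$ and the matrix of latent abilities $\Theta$ integrated over the latent ability distribution, is given by

$$
L(B, \phi \mid Y) = \int L(B, \phi \mid Y, \Theta)\, F(d\Theta)
= \prod_{n=1}^{N} \int P(Y_n \mid \theta; B)\, \pi(\theta; \phi)\, d\theta.
$$

Here we have used the independence of examinees to factor the likelihood function $L(B, \phi \mid Y, \Theta)$
and to factor $F(d\Theta)$ into a product measure.

Taking the logarithm and using (5), the log likelihood function given $Y$ is

$$
\ln L(B, \phi \mid Y) = \sum_{n=1}^{N} \ln \sum_{l=1}^{L} \int P(Y_n \mid \eta, l; B)\, p_l \,
\frac{1}{\sqrt{2\pi}\,\sigma} \exp\left\{ -\frac{(\eta - \mu_l)^2}{2\sigma^2} \right\} d\eta. \tag{6}
$$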
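The marginal log likelihood can be approximated numerically. The sketch below is not the authors' implementation: it replaces the integral over residual ability with Gauss-Hermite quadrature, takes the positivity products for each class as a precomputed array `S`, and assumes the no-slip item response function with a logistic scaling constant of 1.7. All function and argument names are hypothetical.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss

def p_correct(eta, s, b, c, d, scale=1.7):
    """No-slip item response function for all items at once (length-I arrays)."""
    h = lambda x: 1.0 / (1.0 + np.exp(-scale * (x - b)))
    return d * s * h(eta + 2.0 * c) + (1.0 - d) * h(eta)

def marginal_loglik(Y, S, b, c, d, p, mu, sigma, n_nodes=21):
    """Approximate marginal log likelihood.

    Y is an N x I 0/1 response matrix; S is L x I with S[l, i] the positivity
    product of item i for attribute class l; p, mu are length-L; sigma is a scalar.
    """
    nodes, weights = hermgauss(n_nodes)  # rule for integrals of f(x) * exp(-x^2)
    lik = np.zeros(Y.shape[0])
    for l in range(len(p)):
        for x, w in zip(nodes, weights):
            # Change of variable so the rule integrates against Normal(mu_l, sigma^2).
            eta = mu[l] + np.sqrt(2.0) * sigma * x
            P = p_correct(eta, S[l], b, c, d)
            pattern_prob = np.prod(P**Y * (1.0 - P)**(1 - Y), axis=1)
            lik += p[l] * w / np.sqrt(np.pi) * pattern_prob
    return float(np.sum(np.log(lik)))
```

Because the quadrature nodes are centered at each class mean, shifting every item difficulty and every class mean by the same constant leaves the computed value unchanged, which is a convenient sanity check on any implementation.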

2.4 Some model identifiability and estimability issues 

There are various identifiability and estimability problems involving the parameters of the
Unified Model. First we give two definitions, which deal with two different causes for being
unable to estimate model parameters.

Definition 4: If a probability model $p(y \mid \phi)$ is parametrized by a vector $\phi$, we say the model
parameter $\phi$ is not identifiable if there exists $\phi' \ne \phi$ such that for all $y$

$$
p(y \mid \phi) = p(y \mid \phi'),
$$

i.e., the distribution of $y$ is the same for $\phi$ and $\phi'$. In this case, the data simply cannot yield any
information to distinguish $\phi$ from $\phi'$. Further, if $m$ constraints (e.g., fixing $m$ components of $\phi$)
render the $n$-dimensional $\phi$ identifiable, we say that $n - m$ of the components are identifiable.

Definition 5: If a model $p(y \mid \phi)$ is parametrized by a vector $\phi = (\phi_1, \ldots, \phi_n)^T$, we say a
component $\phi_j$ of the model parameter $\phi$ is not estimable if the model does not actually involve
the component $\phi_j$. In this case, the data simply cannot yield any information about $\phi_j$.

First, let us consider an identifiability problem that arises involving the $\pi$'s and $r$'s.

Proposition 2: For item $i$, let $K_i = \sum_{k=1}^{K} q_{ki}$ = the number of attributes required by $i$. Then
among the $2K_i$ of $\pi$'s and $r$'s for which $q_{ki} = 1$, only $K_i + 1$ of them are identifiable.

This identifiability problem is caused by the nonlinear constraints among the $S_{i\alpha}$'s resulting
from their being products of the $\pi$'s and $r$'s. For illustrative purposes, suppose there are just
two attributes and item 1 requires both. Recall (3) and consider its $S$'s. Then we have four
$\pi$'s and $r$'s ($\pi_{11}$, $r_{11}$, $\pi_{21}$, $r_{21}$) to be estimated, or equivalently four $S$'s to be estimated
($S_{11} = \pi_{11}\pi_{21}$, $S_{21} = \pi_{11} r_{21}$, $S_{31} = r_{11}\pi_{21}$, and $S_{41} = r_{11} r_{21}$. Here the first index of the
$S$'s denotes the attribute states $(1,1)^T$, $(1,0)^T$, $(0,1)^T$, and $(0,0)^T$, respectively). Since
$S_{41} = S_{21} S_{31} / S_{11}$, only three $S$'s are identifiable, or equivalently only three of the four
$\pi$'s and $r$'s are identifiable.

To resolve the above identifiability problem involving the $\pi$'s and $r$'s, if an item $i$ requires $K_i$
attributes, we will fix the first $K_i - 1$ $\pi$'s at 1, leaving only the last $\pi$ free, so that item $i$ now
has only $K_i + 1$ free $\pi$ and $r$ parameters.
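The effect of this constraint can be illustrated numerically. The sketch below, with hypothetical function names, rescales a set of positivity parameters so that the leading values are 1 and confirms that the products are unchanged. The starting values are those of item 1 in Table 1.

```python
import itertools
import numpy as np

def s_values(pi, r):
    """Positivity products for every attribute state, for an item requiring all listed attributes."""
    pi, r = np.asarray(pi, float), np.asarray(r, float)
    states = itertools.product([1, 0], repeat=len(pi))  # (1,1), (1,0), (0,1), (0,0) for two attributes
    return np.array([np.prod(np.where(np.array(a) == 1, pi, r)) for a in states])

def fix_leading_pis(pi, r):
    """Rescale so the first K_i - 1 pi's equal 1 while every product is unchanged."""
    pi, r = np.asarray(pi, float).copy(), np.asarray(r, float).copy()
    for k in range(len(pi) - 1):
        scale = pi[k]                # assumes pi_k > 0
        pi[k], r[k] = 1.0, r[k] / scale
        # Absorb the removed factor into the last attribute's parameters.
        pi[-1], r[-1] = pi[-1] * scale, r[-1] * scale
    return pi, r

pi_true, r_true = [0.9, 0.8], [0.1, 0.4]
pi_fixed, r_fixed = fix_leading_pis(pi_true, r_true)
same_s = np.allclose(s_values(pi_true, r_true), s_values(pi_fixed, r_fixed))
```

Both parameter sets generate identical products, so the data cannot tell them apart. This is why comparing products, rather than individual positivity parameters, is the meaningful accuracy check.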

Next, let us look at the identifiability issue involving the $b_i$ and $\mu_l$, recalling (3) and (5).

Proposition 3: If, holding all other parameters fixed, we add the same constant to every
$b_i$ and every $\mu_l$, the log likelihood function $\ln L(B, \phi \mid Y)$ will not change. Note that this is of
course the usual identifiability problem occurring with ordinary IRT logistic modeling.

Proofs of the above results can be found in Jiang (1996). There are other identifiability and
estimability problems; below we give some examples. If $d_i = 1$ (i.e., we are certain the Q strategy
for item $i$ will be selected by all examinees), the Unified Model for the $i$th item becomes

$$
P_i(\eta, \alpha; \beta_i) = S_{i\alpha} H_{b_i}(\eta + 2c_i).
$$

Since

$$
H_{b_i}(\eta + 2c_i) = H_{b_i - 2c_i}(\eta),
$$

we cannot estimate $b_i$ and $c_i$ separately when $d_i = 1$. In the sense of Definition 4, $b_i$ and $c_i$ are
unidentifiable when $d_i = 1$, because different sets of $(b_i, c_i)$ with the same value of $b_i - 2c_i$ will
give the same $P_i$. Similarly, we can argue that if $d_i$ is close to 1, we will not have enough
information from the data to accurately estimate $b_i$ and $c_i$ separately, but rather can estimate
them together through the linear combination $b_i - 2c_i$.

If $d_i = 0$, then

$$
P_i(\eta, \alpha; \beta_i) = H_{b_i}(\eta).
$$

In this case, we cannot estimate $S_{i\alpha}$ for any possible $\alpha$, nor can we estimate $c_i$ (i.e., $S_{i\alpha}$ and $c_i$
are not estimable in the sense of Definition 5).

Since $d_i$ is the probability of selecting Q strategy for item $i$, we can normally assume $d_i$
is bounded away from 0, unless the Q matrix under consideration is badly constructed from a
cognitive perspective.







3 Estimating the Structural Parameters of the Unified Model

Since directly maximizing $\ln L(B, \phi \mid Y)$ over $B$ and $\phi$ is infeasible, we use the EM algorithm.

3.1 EM algorithm for the Unified Model 

The EM algorithm, as its name suggests, is divided into two steps: the E (expectation) step
and the M (maximization) step. Cyclical application of the E step and the M step continues until
a certain convergence criterion is met.

In the E step, the conditional expectation of the log likelihood of the complete data given the
incomplete data and current parameter estimates is computed. In our study, the incomplete data is
the observed response matrix $Y$ and the complete data is the responses plus the examinee latent
ability matrix $\Theta$. So in the E step, the following quantity is computed:

$$
Q(B, \phi; B', \phi') = E\left[ \ln L(B, \phi \mid Y, \Theta) \mid Y; B', \phi' \right],
$$

where the expectation is taken with respect to $\Theta$. Here $B'$ and $\phi'$ are the parameter estimates
resulting from the M step in the previous iteration. Here and below we follow the standard
notation in the literature of the EM algorithm. It is understood that the $Q$ functions in the E and
M steps depend on the observed response matrix $Y$.

It turns out that for the Unified Model the following decomposition holds:

$$
Q(B, \phi; B', \phi') = Q_0(\phi; B', \phi') + \sum_{i=1}^{I} Q_i(\beta_i; B', \phi'),
$$

where $Q_0(\phi; B', \phi')$ involves only the ability distribution parameter $\phi$ and each $Q_i(\beta_i; B', \phi')$
involves only the item structure parameter $\beta_i$ for item $i$.

In the M step, $Q(B, \phi; B', \phi')$ is maximized over the parameters $B$ and $\phi$ for given $B'$, $\phi'$ and
$Y$. Because of the decomposition in the E step, we can separately maximize $Q_0(\phi; B', \phi')$ over $\phi$
and each $Q_i(\beta_i; B', \phi')$ over $\beta_i$.

While there exists a closed-form solution for maximizing $Q_0(\phi; B', \phi')$ over $\phi$, no closed-form
solution exists for maximizing $Q_i(\beta_i; B', \phi')$ over $\beta_i$. Because of the difficulties in computing
the derivatives, especially the second derivative (Hessian) matrix, and the possibility of many
local maxima, the conventional Newton-Raphson method is not feasible to apply. So instead of
using Newton-Raphson, we use Evolution Programming (Michalewicz, 1994) to maximize each
$Q_i(\beta_i; B', \phi')$ over $\beta_i$.

Any optimization task can be thought of as a search through a space of potential solutions.
Evolution Programming is a stochastic algorithm whose search method emulates the natural
phenomena of genetic inheritance and Darwinian strife for survival. An Evolution Program
maintains a population of individuals $P(t) = \{x_1^t, \ldots, x_n^t\}$ for use in iteration $t$. Each individual
is a vector and represents a potential solution to the problem at hand (i.e., a potential optimizer
of the problem). Each solution $x_i^t$ is evaluated to give some measure of "fitness". Then, as a
result of iteration $t$, a new population $P(t + 1)$ for use in iteration $t + 1$ is formed by selecting the
more "fit" individuals (the select step). Some members of this new population undergo
transformations (the alter step) by means of "genetic" operators to form new potential solutions. There
are unary transformations (mutation type), which create new individuals by a small change in a
single individual, and higher order transformations (crossover type), which create new individuals
by combining segments from several (two or more) individuals. After several generations the
program converges, with the goal being that the best individual in this final generation represents
a near-optimum solution.

For our problem at hand, since we are maximizing each $Q_i(\beta_i; B', \phi')$ over $\beta_i$, the population
of individuals consists of vectors of possible values for $\beta_i$. One iteration consists of operations such
as mutation, crossover, and selection. Evolution continues through generations until a certain
convergence accuracy is obtained. Then the maximizer $\hat{\beta}_i$ of $Q_i(\beta_i; B', \phi')$ is given by the best
solution vector of the final generation.
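The select and alter steps described above can be sketched as a generic evolutionary search. The Python below is a minimal illustration under simple assumptions (box constraints, truncation selection, uniform crossover, Gaussian mutation). It is not the Evolution Program of Michalewicz (1994) nor the authors' implementation, and the function name `evolve` and its arguments are hypothetical.

```python
import numpy as np

def evolve(fitness, lower, upper, pop_size=40, generations=200,
           mutation_sd=0.1, seed=0):
    """Maximize `fitness` over the box [lower, upper] with a simple evolutionary search."""
    rng = np.random.default_rng(seed)
    lower, upper = np.asarray(lower, float), np.asarray(upper, float)
    pop = rng.uniform(lower, upper, size=(pop_size, lower.size))
    for _ in range(generations):
        # Select step: keep the fitter half of the population as parents.
        scores = np.array([fitness(x) for x in pop])
        parents = pop[np.argsort(scores)[::-1][: pop_size // 2]]
        # Alter step (crossover): each child mixes coordinates of two random parents.
        idx = rng.integers(0, len(parents), size=(pop_size - len(parents), 2))
        mask = rng.random((len(idx), lower.size)) < 0.5
        children = np.where(mask, parents[idx[:, 0]], parents[idx[:, 1]])
        # Alter step (mutation): small Gaussian perturbation, clipped to the box.
        step = rng.normal(0.0, mutation_sd, children.shape) * (upper - lower)
        children = np.clip(children + step, lower, upper)
        pop = np.vstack([parents, children])
    scores = np.array([fitness(x) for x in pop])
    return pop[np.argmax(scores)], float(scores.max())

# Hypothetical smooth test function with its maximum at (0.3, -0.2).
best_x, best_val = evolve(lambda x: -np.sum((x - np.array([0.3, -0.2]))**2),
                          lower=[-1.0, -1.0], upper=[1.0, 1.0])
```

Because the parents survive unchanged into the next generation, the best fitness never decreases. In an M step, the fitness would be the item-specific part of the expected complete-data log likelihood, evaluated at a candidate item parameter vector.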



4 Simulation Study and Results 

We use a carefully constructed Q matrix with 5 postulated attributes, and an examinee
population having a selected subset of all 32 possible attribute states, to demonstrate the effectiveness
of our estimation procedure.

We postulate 10 items. The Q matrix and the item parameter settings are given
in Table 1. The rationale for choosing the item parameter setting as given in Table 1 is that
the completeness index $c_i$ of the Q matrix is moderate to high, that it is relatively a
single strategy test (i.e., the $d_i$'s are close to 1 so that we are fairly sure an examinee will choose
the single strategy postulated by the Q matrix), and that the positivity is high so that the $\pi$'s are
close to 1 while the $r$'s are small. With a set of well defined attributes and a reasonably well
constructed Q matrix, our choice of the item parameter setting is quite plausible. The simulated
test consists of 40 items obtained by replicating each core item 4 times. For the Q matrix
postulated, there are a total of 24 possible attribute classes (i.e., there are only 24 different ideal
response patterns, resulting in 24 attribute classes from the 32 attribute states). 1000 examinees
are generated by assuming that only 10 of the 24 attribute classes actually occur in the examinee
population. Table 2 gives the latent ability distribution parameter settings for the 10 classes,
along with their representative canonical states and their ideal response patterns. In Table 2 and
subsequent tables, the column label IR refers to ideal response pattern number. Note that while
our choice of the $\mu_l$ is somewhat arbitrary, the $\mu_l$ ordering is consistent with the partial ordering
existing among the attribute states. Because the empirical work needed to find "realistic" model
parameter values $(B, \phi)$ has not been done, we have been forced to select what seem to be
plausible values for the model parameters.

Recall that the Unified Model for the $i$th item is given by (4). The estimated item and latent
ability distribution parameters as a result of our EM algorithm run are given in Tables 3 and 4,
respectively.









Table 1. Q matrix and true item parameters

An entry in an attribute column indicates that the item requires that attribute.

item       attr. 1  attr. 2  attr. 3  attr. 4  attr. 5      b_i     c_i    d_i
1     π      0.9      0.8                                 -0.3497   0.90   0.90
      r      0.1      0.4
2     π                        0.8      1.0      0.9      -0.1929   0.70   0.95
      r                        0.4      0.3      0.1
3     π               1.0      0.8      0.8               -0.0082   0.90   0.65
      r               0.1      0.3      0.2
4     π      1.0                                 1.0       0.0189   0.60   0.95
      r      0.0                                 0.4
5     π                                 1.0      0.9      -0.1832   0.90   0.95
      r                                 0.2      0.3
6     π               0.8      1.0                         0.9343   0.70   0.95
      r               0.1      0.2
7     π      0.9                                          -0.3339   0.40   0.85
      r      0.3
8     π      0.8                        0.9               -0.1006   0.90   0.95
      r      0.4                        0.0
9     π                        1.0               1.0       1.0964   0.90   0.95
      r                        0.4               0.3
10    π               0.8                                  0.2996   0.60   0.95
      r               0.4



Table 2. True latent ability distribution parameters

IR    attribute state    ideal response    true p_l    true μ_l
 7        00111           01001 00010        0.09       -0.057
 9        01101           00000 10011        0.11       -0.760
 2        01111           01101 10011        0.08        0.433
11        10011           00011 01100        0.11       -0.392
 3        10111           01011 01110        0.09        0.551
24        11000           10000 01001        0.11       -0.727
 4        11011           10011 01101        0.10       -0.964
 5        11101           10010 11011        0.10        0.584
 6        11110           10100 11101        0.11        0.448
 1        11111           11111 11111        0.10        0.762

σ = 1.00



Table 3. Estimated item parameters

item       attr. 1  attr. 2  attr. 3  attr. 4  attr. 5     b_i     c_i    d_i
1     π     1.00     0.72                                -0.533   0.80   0.97
      r     0.17     0.37
2     π                       1.00     1.00     0.74      0.219   0.90   0.93
      r                       0.43     0.28     0.05
3     π              1.00     1.00     0.69              -0.009   0.70   0.65
      r              0.05     0.49     0.07
4     π     1.00                                1.00      0.102   0.60   0.94
      r     0.00                                0.38
5     π                                1.00     0.90     -0.272   0.90   0.97
      r                                0.22     0.26
6     π              1.00     0.83                        1.165   0.80   0.85
      r              0.07     0.20
7     π     0.86                                         -0.477   0.50   0.66
      r     0.17
8     π     1.00                       0.70              -0.390   0.90   0.98
      r     0.50                       0.00
9     π                       1.00              0.97     -0.090   0.30   0.95
      r                       0.33              0.27
10    π              0.8                                  0.770   0.80   0.85
      r              0.39



Table 4. True and estimated ability distribution parameters

IR   attribute   ideal           true    est.      true      est.
     state       response        p_i     p_i       μ_i       μ_i
 7   00111       01001 00010     0.09    0.0631    -0.057     0.1933
 9   01101       00000 10011     0.11    0.0975    -0.760    -0.6773
 2   01111       01101 10011     0.08    0.1037     0.433    -0.0190
11   10011       00011 01100     0.11    0.1170    -0.392    -0.1286
 3   10111       01011 01110     0.09    0.1033     0.551     0.5177
24   11000       10000 01001     0.11    0.1175    -0.727    -0.5631
 4   11011       10011 01101     0.10    0.0930    -0.964    -0.7296
 5   11101       10010 11011     0.10    0.0837     0.584     0.5579
 6   11110       10100 11101     0.11    0.1004     0.448     0.9197
 1   11111       11111 11111     0.10    0.0940     0.762     0.8984

est. σ   0.931







From Table 4, we notice that the true and estimated mixing probabilities are close, while the
estimated μ_i values are often not close to the true μ_i. Because of the way the μ_i's function in the
likelihood through the latent ability distribution, it is relatively more difficult to estimate them
accurately, especially when the likelihood surface may be relatively flat. Taking this
into account, we think the estimated μ_i values are satisfactory (see also the comment below on
the comparison between the likelihood at the true and estimated parameters). Note that we
start the EM algorithm run with equal mixing probabilities as initial values for all the 24
possible attribute classes. From Table 4, we see that our procedure selects the right 10 classes;
the estimated mixing probabilities for the other 14 are all approximately 0 (with an average of
0.0019), as desired. Hence their estimated values are not given.
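The reported average of 0.0019 for the 14 unselected classes can be cross-checked from the values in Table 4. The following is a minimal sketch, assuming that the 24 estimated mixing probabilities sum to 1 up to rounding; the variable names are illustrative and not part of the estimation program.

```python
# Mixing probabilities copied from Table 4 (the 10 selected classes).
true_p = [0.09, 0.11, 0.08, 0.11, 0.09, 0.11, 0.10, 0.10, 0.11, 0.10]
est_p = [0.0631, 0.0975, 0.1037, 0.1170, 0.1033,
         0.1175, 0.0930, 0.0837, 0.1004, 0.0940]

n_classes = 24                  # all candidate attribute classes
start_value = 1.0 / n_classes   # equal starting mixing probability

# Assumption: the 24 estimated mixing probabilities sum to 1,
# so the mass not listed in Table 4 belongs to the other 14 classes.
leftover = 1.0 - sum(est_p)
avg_other = leftover / (n_classes - len(est_p))

# Largest gap between a true and an estimated mixing probability.
max_gap = max(abs(t - e) for t, e in zip(true_p, est_p))

print(f"starting value per class:      {start_value:.4f}")
print(f"average over other 14 classes: {avg_other:.4f}")
print(f"largest |true - est| gap:      {max_gap:.4f}")
```

Under that assumption the leftover mass per class agrees with the average quoted in the text, and every selected class has moved away from its common starting value toward its true mixing probability.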

Because of Proposition 2, different sets of π's and r's can generate the same set of S_{αi}'s. In our
estimation procedure, remember that for an item i that requires K_i attributes we (arbitrarily)
fix the first K_i - 1 of the π's at 1 to reduce the indeterminacy among the π's and r's. As a
result, the parameter estimates for some items may appear to be far away from their true values.
Because of this problem, to determine the estimation accuracy of item parameters, we need to
instead compare the estimated S_{αi}'s with their true values, computed from the estimated and true values
of the π's and r's. From Table 3 it is clear that for items 4, 5, 7, 9, and 10 the estimates of the π's and
r's are close to their true values. Consequently, for these items the estimated S_{αi}'s will be close
to the true values. Because the estimates of the π's and r's are far off the true values for items 1, 2,
3, 6, and 8, Tables 5 and 6 show the comparisons of the estimated S_{αi}'s and their true values for
these items. Table 5 compares the true values of the S_{αi}'s with their estimates for items 1, 6, and
8, while Table 6 compares the true S_{αi}'s with their estimates for items 2 and 3. From Tables 5
and 6, we see that the estimated values of S_{αi} are close to their true values for all the possible
attribute states for these items (the average absolute deviation between the true and estimated
S_{αi} is 0.0186 for Table 5 and 0.0295 for Table 6), even though the individual π and r estimates
are not close to the true values. Note that in Tables 5 and 6, when denoting the attribute states,
we list only the attributes required by the item. For example, in Table 5 attribute state 10 is
really (1, 0, *, *, *) for item 1, while it is (*, 1, 0, *, *) for item 6, where * denotes that the entry can be
either 1 or 0.
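The indeterminacy described above can be illustrated with a short sketch. It assumes that S_{αi} is the product, over the attributes an item requires, of π_k for a mastered attribute and r_k for a non-mastered one. That form is not stated in this section, but it reproduces the estimated column for item 1 in Table 5 from the Table 3 estimates. The function names and the first parameter set (0.9, 0.8, 0.1, 0.4) are hypothetical: the values were chosen only because they reproduce the true S values listed for item 1 in Table 5, and they are not taken from the paper's parameter tables.

```python
from itertools import product

def s_value(pi, r, alpha):
    """Assumed form: multiply pi_k for mastered attributes, r_k otherwise."""
    value = 1.0
    for pi_k, r_k, a_k in zip(pi, r, alpha):
        value *= pi_k if a_k == 1 else r_k
    return value

def s_table(pi, r):
    """S for every mastery pattern over the attributes the item requires."""
    return {alpha: s_value(pi, r, alpha)
            for alpha in product((1, 0), repeat=len(pi))}

# Hypothetical parameters for a two-attribute item (illustration only).
pi_a, r_a = (0.9, 0.8), (0.1, 0.4)

# Rescale so that the first pi equals 1: divide the parameters of
# attribute 1 by c and multiply those of attribute 2 by c.
c = pi_a[0]
pi_b = (pi_a[0] / c, pi_a[1] * c)
r_b = (r_a[0] / c, r_a[1] * c)

table_a = s_table(pi_a, r_a)
table_b = s_table(pi_b, r_b)
for alpha in table_a:
    print(alpha, round(table_a[alpha], 4), round(table_b[alpha], 4))

# Estimated pi and r for item 1 as listed in Table 3.
est_item1 = s_table((1.00, 0.72), (0.17, 0.37))
print({alpha: round(s, 4) for alpha, s in est_item1.items()})
```

Every S value contains exactly one factor from each required attribute, so the rescaling cancels and both parameter sets give identical S values. This is why fixing the first K_i - 1 of the π's at 1 removes the ambiguity without changing the fit, and why the S values are the right quantities to compare.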






Table 5. True and estimated S_{αi}'s for items 1, 6, 8

attribute    item 1               item 6               item 8
state        true    estimated    true    estimated    true    estimated
11           0.720   0.7200       0.800   0.8300       0.720   0.7000
10           0.360   0.3700       0.160   0.2000       0.000   0.0000
01           0.080   0.1224       0.100   0.0581       0.360   0.3500
00           0.040   0.0629       0.020   0.0140       0.000   0.0000



Table 6. True and estimated S_{αi}'s for items 2 and 3

             item 2                            item 3
attribute    true    estimated    attribute    true    estimated
state                             pattern
111          0.720   0.7400       111          0.640   0.6900
110          0.080   0.0500       110          0.160   0.0700
101          0.216   0.2072       101          0.240   0.3381
011          0.360   0.3182       011          0.064   0.0345
100          0.024   0.0140       100          0.060   0.0343
010          0.040   0.0215       010          0.016   0.0035
001          0.108   0.0891       001          0.024   0.0169
000          0.012   0.0060       000          0.006   0.0017
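The average absolute deviations quoted for Tables 5 and 6 can be recomputed directly from the tabulated values. The sketch below only re-does that arithmetic; the helper name mean_abs_dev is illustrative.

```python
def mean_abs_dev(pairs):
    """Average of |true - estimated| over (true, estimated) pairs."""
    return sum(abs(t - e) for t, e in pairs) / len(pairs)

# (true, estimated) pairs copied from Table 5.
table5 = [
    (0.720, 0.7200), (0.360, 0.3700), (0.080, 0.1224), (0.040, 0.0629),  # item 1
    (0.800, 0.8300), (0.160, 0.2000), (0.100, 0.0581), (0.020, 0.0140),  # item 6
    (0.720, 0.7000), (0.000, 0.0000), (0.360, 0.3500), (0.000, 0.0000),  # item 8
]

# (true, estimated) pairs copied from Table 6.
table6 = [
    (0.720, 0.7400), (0.080, 0.0500), (0.216, 0.2072), (0.360, 0.3182),  # item 2
    (0.024, 0.0140), (0.040, 0.0215), (0.108, 0.0891), (0.012, 0.0060),
    (0.640, 0.6900), (0.160, 0.0700), (0.240, 0.3381), (0.064, 0.0345),  # item 3
    (0.060, 0.0343), (0.016, 0.0035), (0.024, 0.0169), (0.006, 0.0017),
]

mad5 = mean_abs_dev(table5)
mad6 = mean_abs_dev(table6)
print(f"Table 5: {mad5:.5f}")
print(f"Table 6: {mad6:.5f}")
```

Both results agree with the figures given in the text to the reported precision.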



By examining Tables 1 and 3, we see that the parameter estimates of b_i, c_i, and d_i are not
close to their true values. But remember that when d_i is close to 1, we cannot expect to accurately
estimate b_i and c_i separately because the data contain little information about these parameters
(i.e., the likelihood surface is rather flat). That is, we are close to a condition of unidentifiability
of the b_i and c_i. For those items with d_i close to 1, as discussed earlier, a comparison between
the true and estimated b_i - 2c_i is appropriate, and Table 7 shows they are quite close.






Table 7. Comparison of true and estimated item parameters

item   true b_i   est. b_i   true c_i   est. c_i   true d_i   est. d_i   true b_i - 2c_i   est. b_i - 2c_i
 1     -0.3497    -0.533     0.90       0.80       0.90       0.97       -2.1497           -2.133
 2     -0.1929     0.219     0.70       0.90       0.95       0.93       -1.5929           -1.581
 4      0.0189     0.102     0.60       0.60       0.95       0.94       -1.1811           -1.098
 5     -0.1832    -0.272     0.90       0.90       0.95       0.97       -1.9832           -2.072
 6      0.9343     1.165     0.70       0.80       0.95       0.85       -0.4657           -0.435
 8     -0.1006    -0.390     0.90       0.90       0.95       0.98       -1.9006           -2.190
 9      1.0964    -0.090     0.90       0.30       0.95       0.95       -0.7036           -0.690
10      0.2996     0.770     0.60       0.80       0.95       0.85       -0.9004           -0.830
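The last two columns of Table 7 follow from the b_i and c_i columns by simple arithmetic. The sketch below recomputes b_i - 2c_i from the tabulated values and contrasts the error in that combination with the error in b_i alone; the variable names are illustrative.

```python
# (item, true b, est. b, true c, est. c) copied from Table 7.
rows = [
    (1, -0.3497, -0.533, 0.90, 0.80),
    (2, -0.1929, 0.219, 0.70, 0.90),
    (4, 0.0189, 0.102, 0.60, 0.60),
    (5, -0.1832, -0.272, 0.90, 0.90),
    (6, 0.9343, 1.165, 0.70, 0.80),
    (8, -0.1006, -0.390, 0.90, 0.90),
    (9, 1.0964, -0.090, 0.90, 0.30),
    (10, 0.2996, 0.770, 0.60, 0.80),
]

results = {}
for item, true_b, est_b, true_c, est_c in rows:
    true_comb = true_b - 2.0 * true_c   # true b_i - 2 c_i
    est_comb = est_b - 2.0 * est_c      # estimated b_i - 2 c_i
    results[item] = (true_comb, est_comb,
                     abs(true_b - est_b), abs(true_comb - est_comb))
    print(f"item {item:2d}: true {true_comb:8.4f}  est {est_comb:7.3f}  "
          f"|error in b| {results[item][2]:.4f}  "
          f"|error in b - 2c| {results[item][3]:.4f}")
```

Item 9 shows the effect most clearly: the estimate of b_i is off by more than 1, yet the combination b_i - 2c_i is recovered to within about 0.014.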



Remember that the ultimate goal of estimating the model parameters is to enable us to
classify the examinees cognitively. Since we are using the marginal maximum likelihood approach
to estimate the model parameters (the calibration step, preliminary to the classification step),
another way (the right way) to look at the estimation accuracy of our model calibration procedure
is to compare the log likelihood values given at the true and estimated parameters. Because of
the possibility of a relatively flat likelihood surface in certain locations of the model parameter
space, different parameter sets might give approximately the same likelihood. However, if we
use the estimated likelihood as input to an examinee cognitive classification procedure, it is the
likelihood value rather than the estimated parameter values that is central; thus non-influential
differences in estimated parameter values are irrelevant.

For the model we are considering, Figure 1 gives the plot of values of the log likelihood from
each of the EM cycles of a particular run using our estimation program. The horizontal line in
the figure corresponds to the true log likelihood, which is -20(ivS8.(i6. From Figure 1, we can
see that the log likelihood values for the first several EM cycles rapidly approach the true log
likelihood. After 13 or so cycles the log likelihood value is already quite stable. For the last 15 or
so cycles the log likelihood values increase very slowly. The estimated log likelihood from
the final EM cycle is -20629.32, larger than but close to the true log likelihood (the reason why
the estimated log likelihood is larger than the true log likelihood for this particular run might
be the combined effect of estimation error and the randomness we have introduced in
generating the data).
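The behavior described above, a log likelihood that rises quickly over the first EM cycles and then flattens, is typical of EM in general. The following toy sketch shows it for a two-class normal mixture with a common, known standard deviation. It is not the Unified Model and not the authors' program: the sample size, the generating values, the starting values, and all function names are assumptions made for illustration. Only the mixing probabilities and class means are updated, starting from equal mixing probabilities.

```python
import math
import random

def normal_pdf(x, mean, sigma):
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def log_likelihood(data, probs, means, sigma):
    """Marginal log likelihood of the data under the mixture."""
    return sum(math.log(sum(p * normal_pdf(x, m, sigma)
                            for p, m in zip(probs, means)))
               for x in data)

def em_cycle(data, probs, means, sigma):
    """One E step (posterior class weights) and one M step (p and mean)."""
    k = len(probs)
    weight_sums = [0.0] * k
    weighted_x = [0.0] * k
    for x in data:
        joint = [p * normal_pdf(x, m, sigma) for p, m in zip(probs, means)]
        norm = sum(joint)
        for j in range(k):
            post = joint[j] / norm
            weight_sums[j] += post
            weighted_x[j] += post * x
    new_probs = [w / len(data) for w in weight_sums]
    new_means = [wx / w for wx, w in zip(weighted_x, weight_sums)]
    return new_probs, new_means

# Simulated data from hypothetical generating values.
random.seed(0)
gen_probs, gen_means, sigma = [0.4, 0.6], [-1.0, 1.0], 1.0
data = []
for _ in range(2000):
    j = 0 if random.random() < gen_probs[0] else 1
    data.append(random.gauss(gen_means[j], sigma))

gen_ll = log_likelihood(data, gen_probs, gen_means, sigma)

# Start from equal mixing probabilities and rough class means.
probs, means = [0.5, 0.5], [-0.5, 0.5]
trace = [log_likelihood(data, probs, means, sigma)]
for _ in range(30):
    probs, means = em_cycle(data, probs, means, sigma)
    trace.append(log_likelihood(data, probs, means, sigma))

print(f"log likelihood at generating values: {gen_ll:.2f}")
for cycle in (0, 1, 5, 10, 30):
    print(f"cycle {cycle:2d}: {trace[cycle]:.2f}")
```

Each EM cycle cannot decrease the log likelihood, so the trace is non-decreasing. A maximized log likelihood can also exceed the value at the generating parameters, because the maximizer fits the particular sample rather than the population.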



Figure 2. Plot of the Log Likelihood Values from the EM Cycles



5 Summary 

The need for test analysis methods that extract cognitive information useful to
practitioners from ordinary tests is widely recognized and is a topic of vigorous research
in psychometrics and cognitive psychology. Such methods and the underlying theory should be
applicable to tests that are in common use today, as well as in the future to specially constructed
diagnostic instruments based upon cognitive theory, in many cases computer administered. The
goal of developing the Unified Model was to be able to determine, on the basis of a simple test,
what the cognitive strengths and weaknesses of an examinee are, relative to a list of cognitive
attributes of interest in the particular educational setting of the test.

The Unified Model is theoretically appealing relative to other cognitive diagnostic models,
but because of its structural complexity, there is not yet an estimation package available. In this
paper, we have proposed an estimation procedure for the Unified Model and have shown that
it is not only computationally feasible but effective. With an effective estimation procedure for
the Unified Model, we can calibrate the model, and thus classify and estimate examinee latent
abilities, thereby extracting useful cognitive information about the test as well as the examinees.



References 

Bartholomew, D. (1987). Latent variable models and factor analysis. London: Charles Griffin.

Bock, R. D., and Aitkin, M. (1981). Marginal maximum likelihood estimation of item
parameters: application of an EM algorithm. Psychometrika, 46, 443-59.

Dempster, A. P., Laird, N. M., and Rubin, D. B. (1977). Maximum likelihood from
incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B,
39, 1-38.

DiBello, L. V., Stout, W. F., and Roussos, L. A. (1993). Unified cognitive/psychometric
diagnostic assessment likelihood-based classification techniques. In P. D. Nichols, S. E.
Chipman, and R. L. Brennan (Eds.), Cognitively diagnostic assessment. Hillsdale, NJ:
Lawrence Erlbaum.

Haertel, E. (1990). Continuous and discrete latent structure models for item response data.
Psychometrika, 55, 477-94.

Jiang, H. (1996). Theory and Applications of Computational Statistics in Cognitive Diagnosis
and IRT Modeling. Ph.D. Thesis, University of Illinois at Urbana-Champaign.

Lazarsfeld, P. F., and Henry, N. W. (1968). Latent structure analysis. New York:
Houghton-Mifflin.

Michalewicz, Z. (1994). Genetic algorithms + data structures = evolution programs. Berlin:
Springer-Verlag.

McCormick, G. (1967). Second order conditions for constrained minima. SIAM Journal on
Applied Mathematics, 15, 641-52.

Snow, R., and Lohman, D. (1993). Implications of cognitive psychology for educational
measurement. In R. L. Linn (Ed.), Educational measurement. New York: American Council
on Education, Macmillan.

Takane, Y., and de Leeuw, J. (1987). On the relationship between item response theory and
factor analysis of discretized variables. Psychometrika, 52, 393-408.

Tatsuoka, K. K. (1984). Caution indices based on item response theory. Psychometrika, 49,
95-110.

Tatsuoka, K. K. (1985). A probabilistic model for diagnosing misconceptions in the pattern
classification approach. Journal of Educational Statistics, 12, 55-73.

Tatsuoka, K. K. (1990). Toward an integration of item-response theory and cognitive error
diagnoses. In N. Frederiksen, R. L. Glaser, A. M. Lesgold, and M. G. Shafto (Eds.),
Diagnostic monitoring of skill and knowledge acquisition. Hillsdale, NJ: Lawrence Erlbaum.

Tatsuoka, K. K., and Tatsuoka, M. M. (1987). Bug distribution and pattern classification.
Psychometrika, 52, 193-206.

Yamamoto, K. (1987). A model that combines IRT and latent class models. Ph.D. Thesis,
University of Illinois at Urbana-Champaign.







